Legal case document similarity: You need both network and text

Authors

Abstract

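Estimating the similarity between two legal case documents is an important and challenging problem, having various downstream applications such as prior-case retrieval and citation recommendation. There are two broad approaches for the task: citation network-based and text-based. Prior citation network-based approaches consider citations only to prior-cases (also called precedents) (PCNet). This approach misses important signals inherent in Statutes (written laws of a jurisdiction). In this work, we propose Hier-SPCNet that augments PCNet with a heterogeneous network of Statutes. We incorporate domain knowledge for legal document similarity into Hier-SPCNet, thereby obtaining state-of-the-art results for network-based legal document similarity. Both textual and network similarity provide important signals for legal case similarity; but till now, only trivial attempts have been made to unify the two signals. In this work, we apply several methods for combining textual and network information for estimating legal case similarity. We perform extensive experiments over legal case documents from the Indian judiciary, where the gold standard similarity between document-pairs is judged by law experts from two reputed Law institutes in India. Our experiments establish that our proposed network-based methods significantly improve the correlation with domain experts’ opinion when compared to the existing methods for network-based legal document similarity. Our best-performing combination method (that combines network-based and text-based similarity) improves the correlation with domain experts’ opinion by 11.8% over the best text-based method and 20.6% over the best network-based method. We also establish that our best-performing method can be used to recommend/retrieve citable and similar cases for a source (query) case, which are well appreciated by legal experts.

• Legal case document similarity is a practically useful task.
• The proposed methods improve substantially over prior methods, and we demonstrate their utility for recommending similar legal documents.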
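The abstract describes combining a text-based and a network-based similarity score and then measuring how well the result correlates with expert judgments, but it does not say which combination methods or which correlation measure the authors use. The following is only a minimal sketch under stated assumptions: a weighted sum as the combination, Pearson correlation as the measure, and hypothetical helper names (`cosine`, `combine`, `pearson`) and toy inputs that do not come from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def combine(text_sim, net_sim, alpha=0.5):
    """Weighted sum of the two scores. ASSUMPTION: the paper's actual
    combination methods are not given in the abstract."""
    return alpha * text_sim + (1 - alpha) * net_sim


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists.
    ASSUMPTION: the abstract only says 'correlation'."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


# Toy document pairs: (text embedding A, text embedding B, network embedding A, network embedding B).
# All values are made up for illustration.
pairs = [
    ([1.0, 0.0], [0.9, 0.1], [0.2, 0.8], [0.3, 0.7]),
    ([1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.9, 0.1]),
    ([0.6, 0.4], [0.5, 0.5], [1.0, 0.0], [0.0, 1.0]),
]
expert_scores = [0.9, 0.3, 0.5]  # hypothetical expert similarity judgments

combined = [combine(cosine(ta, tb), cosine(na, nb)) for ta, tb, na, nb in pairs]
print(pearson(combined, expert_scores))
```

In a setup like this, `alpha` would be tuned on held-out document pairs; the sketch fixes it at 0.5 only to keep the example short.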

Similar resources

Similarity Measures for Text Document Clustering

Clustering is a useful technique that organizes a large quantity of unordered text documents into a small number of meaningful and coherent clusters, thereby providing a basis for intuitive and informative navigation and browsing mechanisms. Partitional clustering algorithms have been recognized to be more suitable as opposed to the hierarchical clustering schemes for processing large datasets....

Automatic Term Extraction and Document Similarity in Special Text Corpora

This paper confirms that the performance of a state-of-the-art automatic term extraction method on a computer science corpus is similar to previously published performance data on a medical corpus. The extracted terms are then used to estimate the similarity of papers in the computer science corpus using the standard Vector Space Model. The precision of retrieval using a term-based representati...

A Text Similarity Approach for Precedence Retrieval from Legal Documents

Precedence retrieval of legal documents is an information retrieval task to retrieve prior case documents that are related to a given case document. This helps in automatic linking of related documents to ensure that identical situations are treated similarly in every case. Several methodologies, such as information extraction based on natural language processing, rule-based method, and machine...

Parametric or Direct Modeling: why you may need both

Vertical Bar Detection for Gauging Text Similarity of Document Images

A new method for gauging text similarity of image-based document using word shape recognition is proposed in this paper. Image features are directly extracted instead of using OCR (optical character recognition). The proposed method forms so-called vertical bar patterns by detecting local extrema points in word units extracted by segmenting the document images. These vertical bar patterns form ...

Journal

Journal title: Information Processing and Management

Year: 2022

ISSN: 0306-4573, 1873-5371

DOI: https://doi.org/10.1016/j.ipm.2022.103069